|
In statistics, the hypergeometric distribution is the discrete probability distribution generated by picking colored balls at random from an urn without replacement. Various generalizations of this distribution exist for cases where the picking is biased, so that balls of one color are more likely to be picked than balls of another color.

This can be illustrated by the following example. Assume that an opinion poll is conducted by calling random telephone numbers. Unemployed people are more likely to be home and answer the phone than employed people are. Therefore, unemployed respondents are likely to be over-represented in the sample. The probability distribution of employed versus unemployed respondents in a sample of ''n'' respondents can be described as a noncentral hypergeometric distribution.

The description of biased urn models is complicated by the fact that there is ''more than one noncentral hypergeometric distribution''. Which distribution you get depends on whether items (e.g. colored balls) are sampled one by one in a manner where there is competition between the items, or whether they are sampled independently of each other.

There is widespread confusion about this fact. The name ''noncentral hypergeometric distribution'' has been used for two different distributions, and several scientists have used the wrong distribution or erroneously believed that the two distributions were identical. The same name could be attached to two different distributions because they were studied by two separate groups of scientists who had hardly any contact with each other.
Agner Fog (2007, 2008) has suggested that the best way to avoid confusion is to use the name Wallenius' noncentral hypergeometric distribution for the distribution of a biased urn model where a predetermined number of items are drawn one by one in a competitive manner, and the name Fisher's noncentral hypergeometric distribution where items are drawn independently of each other, so that the total number of items drawn is known only after the experiment. The names refer to Kenneth Ted Wallenius and R. A. Fisher, who were the first to describe the respective distributions.

Fisher's noncentral hypergeometric distribution has previously been given the name ''extended hypergeometric distribution'', but this name is rarely used in the scientific literature, except in handbooks that need to distinguish between the two distributions. Some scientists are strongly opposed to using this name.

A thorough explanation of the difference between the two noncentral hypergeometric distributions is therefore needed.

==Wallenius' noncentral hypergeometric distribution==

Assume, for example, that an urn contains ''m''<sub>1</sub> red balls and ''m''<sub>2</sub> white balls, totalling ''N'' = ''m''<sub>1</sub> + ''m''<sub>2</sub> balls. ''n'' balls are drawn at random from the urn one by one without replacement. Each red ball has the weight ''ω''<sub>1</sub>, and each white ball has the weight ''ω''<sub>2</sub>. We assume that the probability of taking a particular ball is proportional to its weight. The physical property that determines the odds may be something other than weight, such as size or slipperiness, but it is convenient to use the word ''weight'' for the odds parameter.

The probability that the first ball picked is red is equal to the weight fraction of red balls:

: <math>p_1 = \frac{m_1 \omega_1}{m_1 \omega_1 + m_2 \omega_2}.</math>

The probability that the second ball picked is red depends on whether the first ball was red or white. If the first ball was red, then the above formula is used with ''m''<sub>1</sub> reduced by one. If the first ball was white, then the above formula is used with ''m''<sub>2</sub> reduced by one.
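The draw-by-draw rule described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names and the parameter names `m1`, `m2` (numbers of red and white balls), `omega1`, `omega2` (their weights) and `n` (number of draws) are assumptions made for this example. The sketch recomputes the weight fraction before every draw, which is enough to obtain the exact distribution of the number of red balls for small urns, and to simulate a single experiment.

```python
import random
from fractions import Fraction


def red_weight_fraction(red_left, white_left, omega1, omega2):
    """Probability that the next ball is red: weight fraction of the red balls left."""
    red_weight = Fraction(red_left * omega1)
    total_weight = red_weight + Fraction(white_left * omega2)
    return red_weight / total_weight


def wallenius_pmf(n, m1, m2, omega1, omega2):
    """Exact distribution of the number of red balls among n competitive draws.

    Returns a dict mapping x (red balls drawn) to its probability.
    Intended for small urns only; it simply follows the urn one draw at a time.
    """
    if not 0 <= n <= m1 + m2:
        raise ValueError("n must lie between 0 and m1 + m2")
    dist = {0: Fraction(1)}  # before any draw, zero red balls with certainty
    for drawn in range(n):
        nxt = {}
        for x, p in dist.items():
            red_left = m1 - x
            white_left = m2 - (drawn - x)
            p_red = red_weight_fraction(red_left, white_left, omega1, omega2)
            if red_left > 0:
                nxt[x + 1] = nxt.get(x + 1, 0) + p * p_red
            if white_left > 0:
                nxt[x] = nxt.get(x, 0) + p * (1 - p_red)
        dist = nxt
    return dist


def draw_wallenius(n, m1, m2, omega1, omega2, rng):
    """Simulate one experiment and return the number of red balls drawn."""
    if not 0 <= n <= m1 + m2:
        raise ValueError("n must lie between 0 and m1 + m2")
    red_left, white_left, reds = m1, m2, 0
    for _ in range(n):
        # The odds change after every draw because the urn contents change.
        p_red = float(red_weight_fraction(red_left, white_left, omega1, omega2))
        if rng.random() < p_red:
            reds += 1
            red_left -= 1
        else:
            white_left -= 1
    return reds


if __name__ == "__main__":
    # Hypothetical urn: 2 red balls of weight 2 and 3 white balls of weight 1.
    print(red_weight_fraction(2, 3, 2, 1))
    print(wallenius_pmf(2, 2, 3, 2, 1))
    print(draw_wallenius(2, 2, 3, 2, 1, random.Random(0)))
```

With the hypothetical urn used at the bottom of the sketch, the first ball is red with probability 4/7. If it was red, the second ball is red with probability 2/5; if it was white, the second ball is red with probability 2/3. With equal weights the result reduces to the ordinary hypergeometric distribution.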
The important fact that distinguishes Wallenius' distribution is that there is competition between the balls. The probability that a particular ball is taken in a particular draw depends not only on its own weight, but also on the total weight of the competing balls that remain in the urn at that moment. The weight of the competing balls, in turn, depends on the outcomes of all preceding draws.

A multivariate version of Wallenius' distribution is used if there are more than two different colors. The distribution of the balls that are not drawn is a complementary Wallenius' noncentral hypergeometric distribution.

Source: excerpted from the article "Noncentral hypergeometric distributions" in Wikipedia, the free encyclopedia.
|